Nonnormal Multivariate Distributions: Inference
based on Elliptically Contoured Distributions*

T.  W.  Anderson* 

Stanford  University 


1.  Introduction. 

The classical or conventional multivariate analysis is based largely on the multivariate
normal distribution. This probability model fits many, though not all, sets of
continuous multivariate data. The theory and methodology of inference for this model
are highly developed and have been exposited extensively. The nature of the normal
distribution permits considerable analysis in terms of conventional matrix algebra. The
fact that the parameter set consists of a vector and a matrix that can be interpreted
as the mean of the observation vector and its covariance matrix makes inference
relatively easy to interpret and simplifies the analysis. Of course, these advantages of
simplicity are also disadvantages of inflexibility that restrict the applicability. It is
useful, therefore, to extend multivariate probability distributions beyond the normal
class.

In this lecture we shall describe a larger class of distributions, thus augmenting
the scope of analysis. The set of nonnormal distributions, of course, is very wide; we
cannot hope to cover more than a portion of this field. In the International
Symposium on Multivariate Analysis and Its Applications held in Hong Kong in March 1992,
Ingram Olkin gave a paper entitled “Multivariate Nonnormal Distributions.” The
topics included bivariate binomial distributions, bivariate Poisson distributions, bivariate
exponential distributions, and multivariate distributions with given marginal
distributions; none of these subjects will be included in the present paper. The title of
William Cleveland’s presentation to the Hong Kong conference was “Computer
Intensive Methods and Graphical Methods for Analyzing Multivariate Data”; his approach
was that of data analysis, another subject I am not including here.

*The first C.G. Khatri Memorial Lecture given at Pennsylvania State University, May 8, 1992.

* Research supported by the U. S. Army Research Office Contract No. DAAL03-89-K-0033 at
Stanford University.

My paper is devoted to the exposition of elliptically contoured distributions and
statistical inference appropriate to such distributions. This class of distributions
provides more flexibility; specifically, it permits nontrivial kurtosis, and the marginal
distributions can have long tails. At the same time much of the structure of the normal
distribution is retained.

As we shall see, many of the statistical methods appropriate to normal parent
distributions are also suitable for a more general class of elliptically contoured
distributions, but since the kurtosis in an elliptically contoured distribution may be quite
different from the null kurtosis of the normal, other methods are often needed. Such
methods may be more robust than normal methods, which are usually based on
linear and quadratic functions of the observations. Not only does this larger class of
distributions call for new methods; the class also forms an excellent framework in which
to study and evaluate robust procedures.

It  can  be  expected  that  in  the  future  much  more  attention  will  be  paid  to  the 
elliptically  contoured  distribution.  This  paper  will  point  to  some  important  aspects. 

Chmielewski (1981) has given a review of the papers on spherically contoured and
elliptically contoured distributions that appeared before 1980. He mentions Maxwell
(1860), Bartlett (1934), and Hartman and Wintner (1940) as three of the earliest
papers. Kelker (1970) developed some of the properties of spherically and elliptically
contoured distributions. A recent summary is given in Fang and Zhang (1990). See
also Fang and Anderson (1990).
2. The Normal Distribution


2.1.  General 

The normal distribution of a (nondegenerate) $p$-component random vector $X =
(X_1, \ldots, X_p)'$ has a density which can be written

$$\frac{1}{(2\pi)^{p/2}|\Lambda|^{1/2}}\, e^{-\frac{1}{2}(x-\nu)'\Lambda^{-1}(x-\nu)}, \tag{2.1}$$

where $\nu$ is a $p$-component vector and $\Lambda$ is a positive definite matrix. Integration
shows that the mean vector and covariance matrix of $X$ are

$$\mathcal{E}X = \nu, \qquad \mathcal{E}(X - \mathcal{E}X)(X - \mathcal{E}X)' = \Lambda, \tag{2.2}$$

respectively. There is mnemonic advantage in renaming this vector $\nu$ and this matrix
$\Lambda$ as $\mu = (\mu_1, \ldots, \mu_p)'$ and $\Sigma = (\sigma_{ij})$, respectively. Hence the density of $X$ is

$$\frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)}; \tag{2.3}$$

we write $X \sim N(\mu, \Sigma)$.

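The characteristic function of $X$ is

$$\mathcal{E}e^{it'X} = e^{-\frac{1}{2}t'\Sigma t + it'\mu}. \tag{2.4}$$

The moments of $X$ up to order 4 are

$$\mathcal{E}X = \mu, \qquad \mathcal{E}(X - \mu)(X - \mu)' = \Sigma, \tag{2.5}$$

$$\mathcal{E}(X_i - \mu_i)(X_j - \mu_j)(X_k - \mu_k) = 0, \tag{2.6}$$

$$\mathcal{E}(X_i - \mu_i)(X_j - \mu_j)(X_k - \mu_k)(X_l - \mu_l) = \sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}. \tag{2.7}$$

Every moment of $X - \mu$ of odd order is 0. The contours of constant density are ellipsoids

$$(x - \mu)'\Sigma^{-1}(x - \mu) = \text{const}. \tag{2.8}$$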


2.2.  The  spherical  normal  distribution 

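Let $A$ be any nonsingular matrix satisfying

$$AA' = \Sigma. \tag{2.9}$$

Define

$$Y = A^{-1}(X - \mu). \tag{2.10}$$

Then the density of $Y$ is

$$\frac{1}{(2\pi)^{p/2}}\, e^{-\frac{1}{2}y'y}. \tag{2.11}$$

The characteristic function of $Y$ is

$$\mathcal{E}e^{it'Y} = e^{-\frac{1}{2}t't}. \tag{2.12}$$

The moments of $Y$ of order up to 4 are

$$\mathcal{E}Y = 0, \qquad \mathcal{E}YY' = I_p, \tag{2.13}$$

$$\mathcal{E}Y_iY_jY_k = 0, \tag{2.14}$$

$$\mathcal{E}Y_iY_jY_kY_l = \delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}, \tag{2.15}$$

where $\delta_{ii} = 1$ and $\delta_{ij} = 0$, $i \ne j$. Every moment of odd order is 0. The contours of
constant density are spheres centered at the origin.

Define

$$R^2 = \|Y\|^2 = Y'Y, \tag{2.16}$$

$$U = \frac{1}{R}\,Y. \tag{2.17}$$

Then

$$R^2 \stackrel{d}{=} \chi_p^2, \tag{2.18}$$

where (2.18) means $R^2$ is distributed as $\chi_p^2$, the chi-squared random variable with $p$
degrees of freedom. The density of $W = R^2$ is

$$\frac{1}{2^{p/2}\Gamma(p/2)}\, w^{\frac{p}{2}-1} e^{-\frac{1}{2}w}. \tag{2.19}$$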
The vector $U$ has the uniform distribution on the unit sphere

$$u'u = 1; \tag{2.20}$$

that is, the distribution of $PU$ is that of $U$ for any (fixed) orthogonal matrix $P$. We
write this as

$$PU \stackrel{d}{=} U. \tag{2.21}$$

The scalar $R$ and the vector $U$ are independent. We can represent $Y$ as

$$Y = RU, \tag{2.22}$$

and we can represent $X$ as

$$X = \mu + RAU. \tag{2.23}$$

If $\Sigma$ is of rank $r$, then we can write $\Sigma = AA'$ with $A$ $p \times r$. If $Y \sim N(0, I_r)$, we
can represent $X$ as $AY + \mu$ with $Y = RU$, where $R^2 \sim \chi_r^2$ and $U$ has the uniform
distribution on $u'u = 1$ in $r$ dimensions. The characteristic function of $X$ is (2.4).

The moments of $Y$ are the products of the corresponding moments of $R$ and those
of $U$. Since the first two moments of $R^2$ are

$$\mathcal{E}R^2 = p, \qquad \mathcal{E}R^4 = p(p + 2), \tag{2.24}$$

the odd-ordered moments of $U$ are 0,

$$\mathcal{E}UU' = \frac{1}{\mathcal{E}R^2}\,\mathcal{E}YY' = \frac{1}{p}\,I_p, \tag{2.25}$$

$$\mathcal{E}U_iU_jU_kU_l = \frac{1}{p(p + 2)}\,(\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}). \tag{2.26}$$
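The moment formulas (2.25) and (2.26) can be checked numerically. The following is a minimal sketch, assuming only that normalizing a $N(0, I_p)$ vector gives a uniform point on the unit sphere, as in (2.17); the function names, the seed, and the number of draws are illustrative choices and not part of the lecture.

```python
import math
import random


def uniform_on_sphere(p, rng):
    """Draw U uniformly on u'u = 1 by normalizing a N(0, I_p) vector."""
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    r = math.sqrt(sum(v * v for v in z))
    return [v / r for v in z]


def moments_of_u(p, n_draws=50_000, seed=0):
    """Monte Carlo estimates of E U_1^2, E U_1^4 and E U_1^2 U_2^2."""
    rng = random.Random(seed)
    m2 = m4 = m22 = 0.0
    for _ in range(n_draws):
        u = uniform_on_sphere(p, rng)
        m2 += u[0] ** 2
        m4 += u[0] ** 4
        m22 += u[0] ** 2 * u[1] ** 2
    return m2 / n_draws, m4 / n_draws, m22 / n_draws


p = 3
m2, m4, m22 = moments_of_u(p)
print(m2, 1 / p)                # (2.25): E U_1^2 = 1/p
print(m4, 3 / (p * (p + 2)))    # (2.26) with i = j = k = l
print(m22, 1 / (p * (p + 2)))   # (2.26) with i = j, k = l, i != k
```

Each printed pair should agree to roughly two decimal places; the agreement is statistical, not exact.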

3.  Elliptically  Contoured  Distributions. 

3.1.  Spherical  distributions. 

Analogous to the normal distribution, a spherical distribution in general can be
characterized in several ways as follows:

1. If $Y$ ($p \times 1$) has a density, it is of the form $g(y'y)$, where $g(y'y) \ge 0$ and

$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(y'y)\,dy = 1. \tag{3.1}$$

Contours of constant density are spheres: $y'y = \text{const}$. However, $Y$ may have a
spherical distribution even though a density does not exist. For example, the vector
$U$ defined in Section 2 has a spherical distribution.

2. For every orthogonal matrix $P$

$$Y \stackrel{d}{=} PY. \tag{3.2}$$

If $Y$ has a density, (3.2) follows from the form $g(y'y)$.

3. Property 2 implies that the characteristic function of $Y$ has the form

$$\mathcal{E}e^{it'Y} = \phi(t't). \tag{3.3}$$

4. The random vector $Y$ has the representation

$$Y \stackrel{d}{=} RU, \tag{3.4}$$

where $R \ge 0$, $U$ has the uniform distribution on $u'u = 1$, and $R$ and $U$ are
independent. The density of $R$ is found from $g(y'y)$ by transforming to polar coordinates and
integrating out the $p - 1$ angles. (See Anderson (1984), Problems 1 to 4, Chapter 7,
for example.) The resulting density is

$$f(r) = \frac{2\pi^{p/2}}{\Gamma(p/2)}\, r^{p-1} g(r^2). \tag{3.5}$$

We note that $\mathcal{E}R^h < \infty$ if and only if

$$\int_0^\infty r^{h+p-1} g(r^2)\,dr < \infty. \tag{3.6}$$

We can write the characteristic function of $Y$ as

$$\phi(t't) = \mathcal{E}e^{iRt'U} = \int_0^\infty \omega(r^2 t't) f(r)\,dr, \tag{3.7}$$

where

$$\omega(s's) = \mathcal{E}e^{is'U} \tag{3.8}$$

is the characteristic function of $U$.

5. If $\|a\| = \|b\|$, then

$$a'Y \stackrel{d}{=} b'Y. \tag{3.9}$$

We shall denote the distribution of $Y$ with characteristic function (3.3) as $S_p(\phi)$.


3.2. Elliptically contoured distributions in general.

Define

$$X = \mu + AY, \tag{3.10}$$

where $A$ is a nonsingular matrix such that

$$AA' = \Lambda. \tag{3.11}$$

1. The density of $X$ is

$$|\Lambda|^{-1/2}\, g[(x - \mu)'\Lambda^{-1}(x - \mu)]. \tag{3.12}$$

3. The characteristic function of $X$ is

$$\mathcal{E}e^{it'X} = e^{it'\mu}\phi(t'\Lambda t). \tag{3.13}$$

4.

$$X \stackrel{d}{=} \mu + RAU, \tag{3.14}$$

where $R$ and $U$ were defined above.

Contours of constant density are

$$(x - \mu)'\Lambda^{-1}(x - \mu) = \text{const}. \tag{3.15}$$

We shall denote the distribution of $X$ with characteristic function (3.13) as
$EC_p(\mu, \Lambda; \phi)$.
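3.3. Moments.

The moments of $X$ can be found from the moments of $R$ and $U$, which are
independent. The moments of $U$ were given in Section 2. We find

$$\mathcal{E}X = \mu, \tag{3.16}$$

$$\mathcal{E}(X - \mu)(X - \mu)' = \frac{\mathcal{E}R^2}{p}\,\Lambda = \Sigma, \tag{3.17}$$

$$\mathcal{E}(X_i - \mu_i)(X_j - \mu_j)(X_k - \mu_k) = 0. \tag{3.18}$$

In fact, all moments of $X - \mu$ of odd order are 0. The fourth-order moments are
obtained from (2.26) and (3.10) as

$$\mathcal{E}(X_i - \mu_i)(X_j - \mu_j)(X_k - \mu_k)(X_l - \mu_l)
= \frac{\mathcal{E}R^4}{p(p + 2)}\,(\lambda_{ij}\lambda_{kl} + \lambda_{ik}\lambda_{jl} + \lambda_{il}\lambda_{jk})
= \frac{\mathcal{E}R^4}{(\mathcal{E}R^2)^2}\,\frac{p}{p + 2}\,(\sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}). \tag{3.19}$$

The first moments of $R$ are related to the characteristic function $\phi(\cdot)$ by

$$\mathcal{E}R^2 = -2p\,\phi'(0), \tag{3.20}$$

$$\mathcal{E}R^4 = 4p(p + 2)\,\phi''(0). \tag{3.21}$$

The fourth cumulant of the $i$-th component of $X$ standardized by its standard
deviation is

$$\frac{\mathcal{E}(X_i - \mu_i)^4 - 3[\mathcal{E}(X_i - \mu_i)^2]^2}{[\mathcal{E}(X_i - \mu_i)^2]^2}
= 3\left[\frac{\mathcal{E}R^4}{(\mathcal{E}R^2)^2}\,\frac{p}{p + 2} - 1\right]
= 3\left[\frac{\phi''(0)}{[\phi'(0)]^2} - 1\right]
= 3\kappa, \tag{3.22}$$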
say. Note that the fourth cumulant is $3\kappa$ for every component of $X$. The fourth
cumulant of $X_i$, $X_j$, $X_k$, and $X_l$ is

$$\kappa_{ijkl} = \mathcal{E}(X_i - \mu_i)(X_j - \mu_j)(X_k - \mu_k)(X_l - \mu_l) - (\sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk})
= \kappa(\sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}). \tag{3.23}$$
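The kurtosis parameter depends on the distribution only through the first two moments of $R^2$. The following sketch evaluates $\kappa = p\,\mathcal{E}R^4 / [(p+2)(\mathcal{E}R^2)^2] - 1$ for two cases. The function names are illustrative, and the radial moments for the multivariate $t$ are an assumption of this sketch: they follow from the standard fact that $R^2/p$ then has an $F$ distribution with $p$ and $m$ degrees of freedom, and they require $m > 4$.

```python
def kurtosis_parameter(p, e_r2, e_r4):
    """kappa = p E R^4 / ((p + 2) (E R^2)^2) - 1."""
    return p * e_r4 / ((p + 2) * e_r2 ** 2) - 1.0


def normal_radial_moments(p):
    """R^2 ~ chi^2_p, so E R^2 = p and E R^4 = p(p + 2), as in (2.24)."""
    return p, p * (p + 2)


def t_radial_moments(p, m):
    """Multivariate t with m > 4 d.f.: moments of R^2 = p * F(p, m)."""
    e_r2 = p * m / (m - 2)
    e_r4 = p * (p + 2) * m ** 2 / ((m - 2) * (m - 4))
    return e_r2, e_r4


p = 3
print(kurtosis_parameter(p, *normal_radial_moments(p)))   # normal: kappa = 0
print(kurtosis_parameter(p, *t_radial_moments(p, 5)))     # t with 5 d.f.: 2/(m - 4) = 2
```

In both cases the result does not depend on $p$, which reflects the remark that every component of $X$ has the same standardized fourth cumulant.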


3.4.  Marginal  and  conditional  distributions. 

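The characteristic function of a linear function of $X$, say $Z = BX$, is

$$\mathcal{E}e^{it'Z} = \mathcal{E}e^{it'BX} = e^{it'B\mu}\phi(t'B\Lambda B't) \tag{3.24}$$

by use of (3.13). This shows that $Z$ has the distribution $EC_q(B\mu, B\Lambda B'; \phi)$, where $q$
is the number of rows of $B$. In particular, if

$$X = \begin{pmatrix} X^{(1)} \\ X^{(2)} \end{pmatrix}, \qquad
\mu = \begin{pmatrix} \mu^{(1)} \\ \mu^{(2)} \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{pmatrix}, \tag{3.25}$$

where $X^{(i)}$ has $p_i$ components, then $X^{(1)}$ has the distribution $EC_{p_1}(\mu^{(1)}, \Lambda_{11}; \phi)$.

We can also characterize marginal distributions in terms of the representation
(3.14). Consider

$$Y = \begin{pmatrix} Y^{(1)} \\ Y^{(2)} \end{pmatrix}, \qquad
U = \begin{pmatrix} U^{(1)} \\ U^{(2)} \end{pmatrix}, \tag{3.26}$$

where $Y^{(1)}$ and $U^{(1)}$ have $p_1$ components and $Y^{(2)}$ and $U^{(2)}$ have $p_2$ components
($p_1 + p_2 = p$). Then $R_1^2 = Y^{(1)\prime}Y^{(1)}$ has the distribution of $R^2\,U^{(1)\prime}U^{(1)}$, and

$$U^{(1)\prime}U^{(1)} = \frac{U^{(1)\prime}U^{(1)}}{U'U} = \frac{Y^{(1)\prime}Y^{(1)}}{Y'Y}. \tag{3.27}$$

In the case $Y \sim N(0, I_p)$, (3.27) has the beta distribution, say $B(p_1, p_2)$, with density

$$\frac{\Gamma(p/2)}{\Gamma(p_1/2)\Gamma(p_2/2)}\, z^{\frac{p_1}{2}-1}(1 - z)^{\frac{p_2}{2}-1}, \qquad 0 \le z \le 1. \tag{3.28}$$

Hence, for arbitrary $S_p(\phi)$

$$Y^{(1)} \stackrel{d}{=} R_1 V, \tag{3.29}$$

where $R_1^2 \stackrel{d}{=} R^2 b$, $b \sim B(p_1, p_2)$, $V$ has the uniform distribution on $v'v = 1$ in
$p_1$ dimensions, and $R^2$, $b$, and $V$ are independent. All marginal distributions are
elliptically contoured.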

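Now suppose that $A$ satisfying (3.11) is lower triangular. It can be partitioned as

$$A = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix}. \tag{3.30}$$

Then the first $p_1$ rows of (3.14) yield

$$X^{(1)} \stackrel{d}{=} \mu^{(1)} + A_{11}Y^{(1)} \stackrel{d}{=} \mu^{(1)} + A_{11}R_1V. \tag{3.31}$$

Suppose $X$ has the density

$$|\Lambda|^{-1/2}\, g\{(x - \mu)'\Lambda^{-1}(x - \mu)\}
= |\Lambda|^{-1/2}\, g\{[x^{(1)} - \mu^{(1)} - B(x^{(2)} - \mu^{(2)})]'\Lambda_{11\cdot 2}^{-1}[x^{(1)} - \mu^{(1)} - B(x^{(2)} - \mu^{(2)})] + Q_2\}, \tag{3.32}$$

where $B = \Lambda_{12}\Lambda_{22}^{-1}$, $\Lambda_{11\cdot 2} = \Lambda_{11} - \Lambda_{12}\Lambda_{22}^{-1}\Lambda_{21}$, and

$$Q_2 = (x^{(2)} - \mu^{(2)})'\Lambda_{22}^{-1}(x^{(2)} - \mu^{(2)}). \tag{3.33}$$

[See Anderson (1984), Chapter 2, Problem 58, and Theorem A.3.1.] The conditional
density of $X^{(1)}$ given $X^{(2)} = x^{(2)}$ is (3.32) divided by the marginal density of $X^{(2)}$
at $x^{(2)}$, which is $|\Lambda_{22}|^{-1/2} g_2(Q_2)$ for a function $g_2(\cdot)$. Note that the conditional
density is the density of an elliptically contoured distribution. From (3.32) it follows that

$$\mathcal{E}(X^{(1)} \mid x^{(2)}) = \mu^{(1)} + B(x^{(2)} - \mu^{(2)}), \tag{3.34}$$

$$\mathrm{Var}(X^{(1)} \mid x^{(2)}) = h(x^{(2)})\,\Lambda_{11\cdot 2}, \tag{3.35}$$

where $h(x^{(2)})$ is a nonnegative function of $x^{(2)}$. Note that the conditional expectation
of $X^{(1)}$ given $x^{(2)}$ is the same as for the normal distribution and the conditional
covariance matrix is proportional to that for the normal. In this sense the
structure of the normal distribution is maintained.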
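3.5. Examples.

1. The multivariate t-distribution. Suppose $Z \sim N(0, I_p)$, $ms^2 \stackrel{d}{=} \chi_m^2$, and $Z$
and $s^2$ are independent. Define $Y = (1/s)Z$. Then the density of $Y$ is

$$\frac{\Gamma\!\left(\frac{m+p}{2}\right)}{\Gamma\!\left(\frac{m}{2}\right) m^{p/2}\pi^{p/2}} \left(1 + \frac{y'y}{m}\right)^{-\frac{m+p}{2}}, \tag{3.36}$$

and

$$\frac{R^2}{p} = \frac{Y'Y}{p} \stackrel{d}{=} \frac{\chi_p^2/p}{\chi_m^2/m} \stackrel{d}{=} F_{p,m}. \tag{3.37}$$

If $X = \mu + AY$, the density of $X$ is

$$\frac{\Gamma\!\left(\frac{m+p}{2}\right)}{\Gamma\!\left(\frac{m}{2}\right) m^{p/2}\pi^{p/2}|\Lambda|^{1/2}} \left[1 + \frac{(x - \mu)'\Lambda^{-1}(x - \mu)}{m}\right]^{-\frac{m+p}{2}}. \tag{3.38}$$

2. Contaminated normal. The contaminated normal distribution is a mixture of
two normal distributions with proportional covariance matrices and the same mean
vector. The density can be written

$$(1 - \varepsilon)\,\frac{1}{(2\pi)^{p/2}|\Lambda|^{1/2}}\, e^{-\frac{1}{2}(x-\mu)'\Lambda^{-1}(x-\mu)}
+ \varepsilon\,\frac{1}{(2\pi)^{p/2}|c\Lambda|^{1/2}}\, e^{-\frac{1}{2c}(x-\mu)'\Lambda^{-1}(x-\mu)}, \tag{3.39}$$

where $c > 0$ and $0 < \varepsilon < 1$. Usually $\varepsilon$ is rather small and $c$ rather large.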


4.  Sampling. 


4.1.  The  density  and  characteristic  function 

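A random sample from $EC_p(\mu, \Lambda; \phi)$ consists of $N$ vectors $X_1, X_2, \ldots, X_N$. The
density of the sample is

$$|\Lambda|^{-N/2} \prod_{\alpha=1}^{N} g[(x_\alpha - \mu)'\Lambda^{-1}(x_\alpha - \mu)]. \tag{4.1}$$

The characteristic function of the sample is

$$\mathcal{E}\exp\left(i\sum_{\alpha=1}^{N} t_\alpha' X_\alpha\right) = \prod_{\alpha=1}^{N}\left[e^{it_\alpha'\mu}\,\phi(t_\alpha'\Lambda t_\alpha)\right]. \tag{4.2}$$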
In the case of the normal distribution the density and characteristic function are
based on

$$g(w) = (2\pi)^{-p/2} e^{-w/2}, \tag{4.3}$$

$$\phi(v) = e^{-v/2}. \tag{4.4}$$

The density of $X_1, \ldots, X_N$ is

$$\prod_{\alpha=1}^{N} \frac{1}{(2\pi)^{p/2}|\Lambda|^{1/2}} \exp\left[-\tfrac{1}{2}(x_\alpha - \mu)'\Lambda^{-1}(x_\alpha - \mu)\right]
= (2\pi)^{-pN/2}|\Lambda|^{-N/2} \exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,\Lambda^{-1}\sum_{\alpha=1}^{N}(x_\alpha - \mu)(x_\alpha - \mu)'\right]$$

$$= (2\pi)^{-pN/2}|\Lambda|^{-N/2} \exp\left\{-\tfrac{1}{2}\left[\mathrm{tr}\,\Lambda^{-1}A + N(\bar{x} - \mu)'\Lambda^{-1}(\bar{x} - \mu)\right]\right\}, \tag{4.5}$$

where

$$A = \sum_{\alpha=1}^{N}(x_\alpha - \bar{x})(x_\alpha - \bar{x})', \tag{4.6}$$

$$\bar{x} = \frac{1}{N}\sum_{\alpha=1}^{N} x_\alpha. \tag{4.7}$$

Display (4.5) shows that $A$ and $\bar{x}$ are sufficient for $\Lambda$ and $\mu$, and $A$ and $\bar{x}$ are
independent. In fact $\bar{x} \sim N[\mu, (1/N)\Lambda]$ and $A \sim W(\Lambda, n)$, where $W(\Lambda, n)$ denotes
the Wishart distribution with covariance matrix $\Lambda$ and $n = N - 1$ degrees of freedom.
That $A$ and $\bar{x}$ are sufficient statistics and are independent is due to the fact that
$g(w)$ is exponential. These properties do not hold for other elliptically contoured
distributions.

4.2.  The  asymptotic  distribution  of  the  sample  mean  and 
covariance  matrix 

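We define the sample covariance matrix as

$$S = \frac{1}{n}\,A, \tag{4.8}$$

where $n = N - 1$ is the number of degrees of freedom. Then the sample mean and
covariance matrix are unbiased estimators of the model mean and covariance matrix:

$$\mathcal{E}\bar{x} = \mu, \qquad \mathcal{E}S = \Sigma. \tag{4.9}$$

By the law of large numbers they are consistent estimators as $N \to \infty$:

$$\bar{x} \xrightarrow{p} \mu, \qquad S \xrightarrow{p} \Sigma. \tag{4.10}$$

The covariances of $\bar{x}$ and $S$ are

$$\mathrm{Cov}(\bar{x}) = \frac{1}{N}\,\Sigma, \qquad \mathcal{E}(s_{ij} - \sigma_{ij})(\bar{x} - \mu) = 0, \tag{4.11}$$

$$\mathrm{Cov}(s_{ij}, s_{kl}) = \frac{\kappa}{N}\,(\sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}) + \frac{1}{n}\,(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}). \tag{4.12}$$

Then as $N \to \infty$

$$n\,\mathrm{Cov}(s_{ij}, s_{kl}) \to (1 + \kappa)(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}) + \kappa\,\sigma_{ij}\sigma_{kl}. \tag{4.13}$$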

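It will be convenient to use more matrix algebra. Define $\mathrm{vec}\,B$, $B \otimes C$ (the
Kronecker product) and $K_{mn}$ (the commutator matrix) by

$$\mathrm{vec}\,B = \mathrm{vec}\,(b_1, \ldots, b_n) = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}, \tag{4.14}$$

$$B \otimes C = \begin{pmatrix} b_{11}C & \cdots & b_{1n}C \\ \vdots & & \vdots \\ b_{m1}C & \cdots & b_{mn}C \end{pmatrix}, \tag{4.15}$$

$$K_{mn}\,\mathrm{vec}\,B = \mathrm{vec}\,B'. \tag{4.16}$$

See, for example, Magnus and Neudecker (1979). We can rewrite (4.13) as

$$n\,\mathrm{Cov}(\mathrm{vec}\,S) = n\,\mathcal{E}(\mathrm{vec}\,S - \mathrm{vec}\,\Sigma)(\mathrm{vec}\,S - \mathrm{vec}\,\Sigma)'
\to (\kappa + 1)(I_{p^2} + K_{pp})(\Sigma \otimes \Sigma) + \kappa\,\mathrm{vec}\,\Sigma\,(\mathrm{vec}\,\Sigma)'. \tag{4.17}$$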
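The three matrix operations just defined are easy to make concrete. The sketch below builds $\mathrm{vec}$, the Kronecker product, and the commutation matrix $K_{mn}$ for small matrices stored as lists of rows, and checks the defining property (4.16). All function names are illustrative choices.

```python
def vec(b):
    """Stack the columns of matrix b (a list of rows) into one list."""
    rows, cols = len(b), len(b[0])
    return [b[i][j] for j in range(cols) for i in range(rows)]


def transpose(b):
    return [list(col) for col in zip(*b)]


def kron(b, c):
    """Kronecker product B (x) C laid out as in (4.15)."""
    return [[bij * ckl for bij in brow for ckl in crow]
            for brow in b for crow in c]


def commutation_matrix(m, n):
    """K_mn such that K_mn vec(B) = vec(B') for every m x n matrix B."""
    k = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            # b_ij is entry j*m + i of vec(B) and entry i*n + j of vec(B').
            k[i * n + j][j * m + i] = 1
    return k


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


B = [[1, 2, 3], [4, 5, 6]]        # a 2 x 3 matrix
K = commutation_matrix(2, 3)
print(matvec(K, vec(B)) == vec(transpose(B)))   # property (4.16)
```

With $m = n = p$ the same functions give the $K_{pp}$ and $\Sigma \otimes \Sigma$ that appear in (4.17).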
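Then

$$\sqrt{N}\begin{pmatrix} \bar{x} - \mu \\ \mathrm{vec}\,S - \mathrm{vec}\,\Sigma \end{pmatrix}
\xrightarrow{d} N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \Sigma & 0 \\ 0 & (\kappa + 1)(I_{p^2} + K_{pp})(\Sigma \otimes \Sigma) + \kappa\,\mathrm{vec}\,\Sigma\,(\mathrm{vec}\,\Sigma)' \end{pmatrix}\right] \tag{4.18}$$

by the central limit theorem for independent identically distributed random vectors
(with finite fourth moments). This statement forms the basis for large-sample
inference.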


4.3.  Functions  of  sample  covariances 

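Define

$$s = \mathrm{vec}\,S, \qquad \sigma = \mathrm{vec}\,\Sigma. \tag{4.19}$$

Consider $f(s)$, a vector-valued function. Under the usual regularity conditions

$$\sqrt{N}[f(s) - f(\sigma)] = \frac{\partial f(\sigma)}{\partial \sigma'}\,\sqrt{N}(s - \sigma) + o_p(1)
\xrightarrow{d} N\left\{0,\ \frac{\partial f(\sigma)}{\partial \sigma'}\left[2(1 + \kappa)(\Sigma \otimes \Sigma) + \kappa\,\sigma\sigma'\right]\left(\frac{\partial f(\sigma)}{\partial \sigma'}\right)'\right\}. \tag{4.20}$$

Functions of the sample covariance matrix are also asymptotically normally
distributed.

Note that if $[\partial f(\sigma)/\partial \sigma']\,\sigma = 0$ the covariance matrix in (4.20) is simply a multiple
of the covariance matrix when sampling from a normal distribution. Suppose $f(\cdot)$ is
scale invariant (homogeneous of degree zero); that is,

$$f(cs) = f(s), \qquad \forall c > 0, \quad \forall S \text{ p.d.} \tag{4.21}$$

Then

$$0 = \frac{\partial f(cs)}{\partial c} = \frac{\partial f(cs)}{\partial (cs)'}\,\frac{\partial (cs)}{\partial c} = \frac{\partial f(cs)}{\partial (cs)'}\,s; \tag{4.22}$$

that is (for $c = 1$ and $s = \sigma$),

$$\frac{\partial f(\sigma)}{\partial \sigma'}\,\sigma = 0. \tag{4.23}$$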
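Then

$$\sqrt{N}[f(s) - f(\sigma)] \xrightarrow{d} N\left\{0,\ 2(1 + \kappa)\,\frac{\partial f(\sigma)}{\partial \sigma'}(\Sigma \otimes \Sigma)\left[\frac{\partial f(\sigma)}{\partial \sigma'}\right]'\right\}; \tag{4.24}$$

that is,

$$\sqrt{\frac{N}{1 + \kappa}}\,[f(s) - f(\sigma)] \xrightarrow{d} N\left\{0,\ 2\,\frac{\partial f(\sigma)}{\partial \sigma'}(\Sigma \otimes \Sigma)\left[\frac{\partial f(\sigma)}{\partial \sigma'}\right]'\right\}. \tag{4.25}$$

Note that the normal distribution in (4.25) does not depend on $\kappa$ [that is, $g(\cdot)$].

This result applies to any sequence of random positive definite matrices $W_n$ such
that $W_n \xrightarrow{p} \Omega$ and

$$\sqrt{n}(\mathrm{vec}\,W_n - \mathrm{vec}\,\Omega) \xrightarrow{d} N[0,\ \tau_1(I_{p^2} + K_{pp})(\Omega \otimes \Omega) + \tau_2\,\omega\omega'], \tag{4.26}$$

where $w_n = \mathrm{vec}\,W_n$ and $\omega = \mathrm{vec}\,\Omega$. Then

$$\sqrt{n}[f(w_n) - f(\omega)] \xrightarrow{d} N\left\{0,\ 2\tau_1\,\frac{\partial f(\omega)}{\partial \omega'}(\Omega \otimes \Omega)\left[\frac{\partial f(\omega)}{\partial \omega'}\right]'\right\}. \tag{4.27}$$

Tyler (1983) gave the above result in Theorem 1.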
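Example. Correlation coefficients. Let

$$r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii}s_{jj}}}, \qquad \rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\sigma_{jj}}} \tag{4.28}$$

be the sample and model coefficients. The limit distribution of

$$\sqrt{\frac{n}{1 + \kappa}}\,(r_{ij} - \rho_{ij}), \qquad i, j = 1, \ldots, p, \tag{4.29}$$

is the same as for $S$ having a Wishart distribution.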

Example. Eigenvectors and ratios of eigenvalues. The eigenvalues of the sample
covariance matrix satisfy

$$|S - \lambda I| = 0. \tag{4.30}$$

The eigenvectors satisfy

$$Sx_i = \lambda_i x_i, \qquad i = 1, \ldots, p. \tag{4.31}$$

For $p = 2$ there is an angle $\theta$ such that

$$S\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \tag{4.32}$$

The normalized eigenvectors are $(\cos\theta, \sin\theta)'$ and $(-\sin\theta, \cos\theta)'$. The angle $\theta$
and the ratio of eigenvalues $\lambda_1/\lambda_2$ are scale invariant. Hence, they have the same
asymptotic normal distribution after correcting for the kurtosis as when sampling
from the normal distribution.


4.4.  Likelihood  ratio  criteria. 

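For normal distributions usually

$$-2\log \mathrm{LRC} \xrightarrow{d} \chi_f^2 \tag{4.33}$$

under the null hypothesis $H$, where $f$ is the appropriate number of degrees of freedom.
Consider a scalar function $h(s)$ such that

$$h(\sigma) = 0, \qquad \frac{\partial h(\sigma)}{\partial \sigma} = 0, \qquad \sigma \in H. \tag{4.34}$$

Then

$$nh(s) = \tfrac{1}{2}\sqrt{n}(s - \sigma)'\,\frac{\partial^2 h(\sigma)}{\partial \sigma\,\partial \sigma'}\,\sqrt{n}(s - \sigma) + o_p(1)
\xrightarrow{d} \sum_i \nu_i\,\chi_{1i}^2, \tag{4.35}$$

where the $\nu_i$ are the characteristic roots of

$$\tfrac{1}{2}\,\frac{\partial^2 h(\sigma)}{\partial \sigma\,\partial \sigma'}\left[2(1 + \kappa)(\Sigma \otimes \Sigma) + \kappa\,\sigma\sigma'\right] \tag{4.36}$$

and $\chi_{1i}^2$ denotes $\chi^2$ with 1 d.f., the variables being independent.

Suppose $h$ is scale invariant; that is,

$$h(cs) = h(s), \qquad \forall c > 0, \quad \forall S \text{ p.d.} \tag{4.37}$$

Then

$$0 = \frac{\partial^2 h(cs)}{\partial c^2} = s'\,\frac{\partial^2 h(cs)}{\partial (cs)\,\partial (cs)'}\,s. \tag{4.38}$$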
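For $c = 1$ we obtain

$$\frac{\partial^2 h(\sigma)}{\partial \sigma\,\partial \sigma'}\,\sigma = 0. \tag{4.39}$$

Here the $\nu_i$ are the characteristic roots of

$$(1 + \kappa)\,\frac{\partial^2 h(\sigma)}{\partial \sigma\,\partial \sigma'}\,(\Sigma \otimes \Sigma). \tag{4.40}$$

If

$$-2\log \mathrm{LRC} = nh(s) \xrightarrow{d} \sum_i \nu_i\,\chi_{1i}^2 \tag{4.41}$$

under normality (that is, with the $\nu_i$ computed for $\kappa = 0$), then for an elliptically
contoured distribution with kurtosis $\kappa$

$$\frac{-2\log \mathrm{LRC}}{1 + \kappa} \xrightarrow{d} \sum_i \nu_i\,\chi_{1i}^2. \tag{4.42}$$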

Example. Sphericity. Consider the null hypothesis

$$H: \Lambda = \text{const}\cdot I. \tag{4.43}$$

Under normality the likelihood ratio criterion is

$$\mathrm{LRC} = \left[\frac{|S|}{(\mathrm{tr}\,S/p)^p}\right]^{n/2}, \tag{4.44}$$

which is clearly scale invariant, and

$$-2\log \mathrm{LRC} = n[p\log(\mathrm{tr}\,S) - \log|S| - p\log p]. \tag{4.45}$$

Then

$$\frac{-2\log \mathrm{LRC}}{1 + \kappa} \xrightarrow{d} \chi_f^2, \tag{4.46}$$

where the number of degrees of freedom is $f = \frac{1}{2}p(p + 1) - 1$.

Many hypotheses in multivariate analysis are invariant with respect to some group
of linear transformations. For example, the hypothesis (4.43) is invariant with respect
to transformations $X \to cQX$, where $Q$ is orthogonal. If the group of
transformations includes multiplication by a constant, the likelihood ratio criterion will satisfy
(4.37).
Tyler (1983) has an alternative approach to testing hypotheses. Suppose a null
hypothesis is defined by $k(\sigma) = 0$, where $k(\cdot)$ has $q$ components and satisfies the
usual regularity conditions and (4.21). A Wald test can be based on

$$\frac{n}{1 + \kappa}\,k(s)'\left\{2\,\frac{\partial k(s)}{\partial s'}(S \otimes S)\left[\frac{\partial k(s)}{\partial s'}\right]'\right\}^{-1} k(s). \tag{4.47}$$

Tyler showed that this statistic has a limiting $\chi_q^2$-distribution under the null
hypothesis. A function that is asymptotically equivalent to (4.47) is

=  }  *(*)’  (4-48) 

which  satisfies  (4.34). 


4.5.  Estimation  of  the  kurtosis  parameter. 

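To apply the large-sample distribution theory derived for normal distributions to
problems of inference for elliptically contoured distributions it is necessary to know
or estimate the kurtosis parameter $\kappa$. Note that, with $Y = \Sigma^{-1/2}(X - \mu)$
standardized to have covariance matrix $I_p$, (3.23) gives

$$\mathcal{E}[(X - \mu)'\Sigma^{-1}(X - \mu)]^2 = \mathcal{E}(Y'Y)^2
= p\,\mathcal{E}Y_1^4 + p(p - 1)\,\mathcal{E}Y_1^2Y_2^2
= 3p(1 + \kappa) + p(p - 1)(1 + \kappa)
= p(p + 2)(1 + \kappa). \tag{4.49}$$

We see that

$$M = \frac{1}{N}\sum_{\alpha=1}^{N}[(x_\alpha - \bar{x})'S^{-1}(x_\alpha - \bar{x})]^2 \xrightarrow{p} p(p + 2)(1 + \kappa) \tag{4.50}$$

and

$$\frac{M}{p(p + 2)} - 1 \xrightarrow{p} \kappa. \tag{4.51}$$

Mardia (1970) proposed the left-hand side of (4.51), say $\hat{\kappa}$, as a consistent estimator
of $\kappa$. The convergence in (4.25) and (4.42) is valid when $\kappa$ is replaced by the estimator
$\hat{\kappa}$.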
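A moment estimator of $\kappa$ of this kind is straightforward to compute. The sketch below treats the bivariate case ($p = 2$) so that $S^{-1}$ can be written out explicitly. It uses the relation $\mathcal{E}(Y'Y)^2 = p(p+2)(1+\kappa)$ for standardized $Y$, which follows from the fourth-moment structure (3.23), so the estimate is $M/[p(p+2)] - 1$. The function name, seed, and sample size are illustrative assumptions.

```python
import random


def mardia_kappa_bivariate(xs):
    """Moment estimate of kappa for p = 2 from M = (1/N) sum of d_a^4."""
    n_obs = len(xs)
    mean = [sum(x[k] for x in xs) / n_obs for k in range(2)]
    c = [[x[0] - mean[0], x[1] - mean[1]] for x in xs]
    n = n_obs - 1
    s11 = sum(v[0] * v[0] for v in c) / n
    s12 = sum(v[0] * v[1] for v in c) / n
    s22 = sum(v[1] * v[1] for v in c) / n
    det = s11 * s22 - s12 * s12
    m_stat = 0.0
    for v in c:
        # d^2 = (x - xbar)' S^{-1} (x - xbar) via the explicit 2 x 2 inverse.
        d2 = (s22 * v[0] ** 2 - 2 * s12 * v[0] * v[1] + s11 * v[1] ** 2) / det
        m_stat += d2 * d2
    m_stat /= n_obs
    p = 2
    return m_stat / (p * (p + 2)) - 1.0


rng = random.Random(1)
normal_sample = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20_000)]
kappa_hat = mardia_kappa_bivariate(normal_sample)
print(kappa_hat)   # should be close to 0 for normal data
```

For heavier-tailed elliptical samples the same function should return a clearly positive value, although convergence is slow when high-order moments are large.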
5. Estimation of Covariance Parameters


5.1.  Maximum  likelihood  estimation 

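We have considered using $S$ as an estimator of $\Sigma = (\mathcal{E}R^2/p)\Lambda$. When the parent
distribution is normal, $S$ is the sufficient statistic invariant with respect to translations
and hence is the efficient unbiased estimator. Now we study other estimators.

We consider first the maximum likelihood estimators of $\mu$ and $\Lambda$ when the form
of the density $g(\cdot)$ is known. The logarithm of the likelihood function is

$$\log L = -\frac{N}{2}\log|\Lambda| + \sum_{\alpha=1}^{N}\log g[(x_\alpha - \mu)'\Lambda^{-1}(x_\alpha - \mu)]. \tag{5.1}$$

The derivatives of $\log L$ with respect to the components of $\mu$ are

$$\frac{\partial \log L}{\partial \mu} = -2\sum_{\alpha=1}^{N}\frac{g'[(x_\alpha - \mu)'\Lambda^{-1}(x_\alpha - \mu)]}{g[(x_\alpha - \mu)'\Lambda^{-1}(x_\alpha - \mu)]}\,\Lambda^{-1}(x_\alpha - \mu). \tag{5.2}$$

Setting this vector of derivatives to 0 leads to the equation

$$\sum_{\alpha=1}^{N}\frac{g'[(x_\alpha - \hat{\mu})'\hat{\Lambda}^{-1}(x_\alpha - \hat{\mu})]}{g[(x_\alpha - \hat{\mu})'\hat{\Lambda}^{-1}(x_\alpha - \hat{\mu})]}\,x_\alpha
= \hat{\mu}\sum_{\alpha=1}^{N}\frac{g'[(x_\alpha - \hat{\mu})'\hat{\Lambda}^{-1}(x_\alpha - \hat{\mu})]}{g[(x_\alpha - \hat{\mu})'\hat{\Lambda}^{-1}(x_\alpha - \hat{\mu})]}. \tag{5.3}$$

Setting to 0 the derivatives of $\log L$ with respect to the elements of $\Lambda^{-1}$ gives

$$\hat{\Lambda} = -\frac{2}{N}\sum_{\alpha=1}^{N}\frac{g'[(x_\alpha - \hat{\mu})'\hat{\Lambda}^{-1}(x_\alpha - \hat{\mu})]}{g[(x_\alpha - \hat{\mu})'\hat{\Lambda}^{-1}(x_\alpha - \hat{\mu})]}\,(x_\alpha - \hat{\mu})(x_\alpha - \hat{\mu})'. \tag{5.4}$$

The estimator $\hat{\Lambda}$ is a kind of weighted average of the rank 1 matrices $(x_\alpha - \hat{\mu})(x_\alpha - \hat{\mu})'$.
In the normal case [$g(w)$ given by (4.3)] the weights are $1/N$. In most cases (5.3) and
(5.4) cannot be solved explicitly, but the solution may be approximated by iterative
methods.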
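One common iterative scheme alternates between computing the weights $-2g'/g$ at the current estimates and updating the estimates as weighted averages. The sketch below does this for the univariate case ($p = 1$) with the $t$ density with $m$ degrees of freedom, for which $-2g'(d^2)/g(d^2) = (m+1)/(m+d^2)$. The restriction to $p = 1$, the function name, the data, and the stopping rule are simplifying assumptions of this sketch, not a procedure given in the lecture.

```python
def t_mle_univariate(xs, m, n_iter=500, tol=1e-12):
    """Fixed-point iteration for the likelihood equations with p = 1, t density."""
    n_obs = len(xs)
    mu = sum(xs) / n_obs
    lam = sum((x - mu) ** 2 for x in xs) / n_obs
    for _ in range(n_iter):
        # Weight of each observation: -2 g'(d^2)/g(d^2) = (m + 1)/(m + d^2).
        w = [(m + 1) / (m + (x - mu) ** 2 / lam) for x in xs]
        new_mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        new_lam = sum(wi * (x - new_mu) ** 2 for wi, x in zip(w, xs)) / n_obs
        converged = abs(new_mu - mu) + abs(new_lam - lam) < tol
        mu, lam = new_mu, new_lam
        if converged:
            break
    return mu, lam


data = [-1.0, -0.5, 0.0, 0.5, 1.0, 10.0]
mu_hat, lam_hat = t_mle_univariate(data, m=3)
print(mu_hat, lam_hat)
print(sum(data) / len(data))   # the sample mean, pulled toward the outlier
```

The observation at 10 receives a small weight, so the location estimate stays near the bulk of the data, whereas the sample mean does not.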

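The covariance matrix of the limiting normal distribution of $\sqrt{N}(\mathrm{vec}\,\hat{\Lambda} - \mathrm{vec}\,\Lambda)$ is

$$\sigma_{1g}(I_{p^2} + K_{pp})(\Lambda \otimes \Lambda) + \sigma_{2g}\,\mathrm{vec}\,\Lambda\,(\mathrm{vec}\,\Lambda)', \tag{5.5}$$

where

$$\sigma_{1g} = \frac{p(p + 2)}{4\,\mathcal{E}\left\{R^4\left[g'(R^2)/g(R^2)\right]^2\right\}}, \tag{5.6}$$

$$\sigma_{2g} = -\frac{2\sigma_{1g}(1 - \sigma_{1g})}{2 + p(1 - \sigma_{1g})}. \tag{5.7}$$

See Tyler (1982).

Example. Multivariate t. If the density $g(y'y)$ is given by (3.36), then

$$\sigma_{1g} = \frac{p + m + 2}{p + m}. \tag{5.8}$$

Note that $1 + \kappa = (m - 2)/(m - 4)$; that is, $\kappa = 2/(m - 4)$. As $m \to \infty$, $\kappa \to 0$,
$\sigma_{1g} \to 1$, and $\sigma_{2g} \to 0$; these are the values for $N(\mu, \Sigma)$.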

5.2.  Robust  estimators. 


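Maronna (1976) has studied robust estimators or M-estimators. Set

$$d_\alpha^2 = (x_\alpha - \hat{\mu})'\hat{V}^{-1}(x_\alpha - \hat{\mu}) \tag{5.9}$$

for a vector $\hat{\mu}$ and a positive definite matrix $\hat{V}$. Suppose that $\hat{\mu}$ and $\hat{V}$ also satisfy

$$\frac{1}{N}\sum_{\alpha=1}^{N} u_1(d_\alpha)(x_\alpha - \hat{\mu}) = 0, \tag{5.10}$$

$$\frac{1}{N}\sum_{\alpha=1}^{N} u_2(d_\alpha^2)(x_\alpha - \hat{\mu})(x_\alpha - \hat{\mu})' = \hat{V} \tag{5.11}$$

for $u_1(d)$ and $u_2(d^2)$ nonnegative, nonincreasing, and continuous for $d \ge 0$ such that
$d\,u_1(d)$ and $d^2u_2(d^2)$ are bounded. [Maronna (1976) gives two other conditions on
$u_1(\cdot)$ and $u_2(\cdot)$.] Then $\hat{\mu}$ estimates $\mu = \mathcal{E}X$ and $\hat{V}$ estimates $\gamma^{-1}\Lambda = V$, say, where $\gamma$
satisfies

$$\mathcal{E}\,\gamma R^2 u_2(\gamma R^2) = p. \tag{5.12}$$

These estimators have an asymptotic normal distribution. The covariance matrix of
the limiting distribution of $\sqrt{N}[\mathrm{vec}\,\hat{V} - \mathrm{vec}\,V]$ has the same form as (4.17) and (5.5);
it is

$$\sigma_{1u}(I_{p^2} + K_{pp})(V \otimes V) + \sigma_{2u}\,\mathrm{vec}\,V\,(\mathrm{vec}\,V)', \tag{5.13}$$

where

$$\sigma_{1u} = \frac{(p + 2)^2\psi_1}{(2\psi_2 + p)^2}, \qquad
\sigma_{2u} = \psi_2^{-2}\left\{(\psi_1 - 1) - \frac{2\psi_1(\psi_2 - 1)[(p + 4)\psi_2 + p]}{(2\psi_2 + p)^2}\right\}, \tag{5.14}$$

$$\psi_1 = \frac{\mathcal{E}[\gamma^2R^4u_2^2(\gamma R^2)]}{p(p + 2)}, \tag{5.15}$$

$$\psi_2 = \frac{1}{p}\,\mathcal{E}[\gamma R^2u_2(\gamma R^2) + \gamma^2R^4u_2'(\gamma R^2)]. \tag{5.16}$$

See Tyler (1982). Note that if in (5.13) we replace $\hat{V}$ by $\gamma\hat{V}$ and $V$ by $\gamma V = \Lambda$, the
coefficients $\sigma_{1u}$ and $\sigma_{2u}$ are unchanged.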
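The coefficients in the limiting covariance matrix are simple functions of $\psi_1$ and $\psi_2$. The following sketch evaluates $\sigma_{1u} = (p+2)^2\psi_1/(2\psi_2+p)^2$ and $\sigma_{2u} = \psi_2^{-2}\{(\psi_1-1) - 2\psi_1(\psi_2-1)[(p+4)\psi_2+p]/(2\psi_2+p)^2\}$. Two cases are shown. For the sample covariance matrix $S$ (weight $u_2 \equiv 1$) one has $\psi_1 = 1+\kappa$ and $\psi_2 = 1$. For maximum likelihood under the multivariate $t$, the values $\psi_1 = (p+m)/(p+m+2)$ and $\psi_2 = m/(p+m+2)$ were worked out for this sketch from the weight $u_2(s) = (m+p)/(m+s)$ with $\gamma = 1$; they are an assumption, not quoted from the source. The function names are illustrative.

```python
def m_estimator_coefficients(psi1, psi2, p):
    """sigma_1u and sigma_2u as functions of psi_1, psi_2 and p."""
    denom = (2 * psi2 + p) ** 2
    sigma1 = (p + 2) ** 2 * psi1 / denom
    correction = 2 * psi1 * (psi2 - 1) * ((p + 4) * psi2 + p) / denom
    sigma2 = ((psi1 - 1) - correction) / psi2 ** 2
    return sigma1, sigma2


def t_mle_psis(p, m):
    """psi_1, psi_2 for the t maximum likelihood weights (derived for this sketch)."""
    return (p + m) / (p + m + 2), m / (p + m + 2)


# Sample covariance matrix with kappa = 2 (multivariate t, 5 d.f.).
print(m_estimator_coefficients(1 + 2.0, 1.0, p=2))          # (1 + kappa, kappa)

# Maximum likelihood for the multivariate t with 1 d.f., p = 2.
print(m_estimator_coefficients(*t_mle_psis(2, 1), p=2))     # sigma_1 = (p + m + 2)/(p + m)
```

The first case recovers the coefficients $1+\kappa$ and $\kappa$ of (4.17); in the second, $\sigma_1$ equals $(p+m+2)/(p+m)$.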

Tyler (1983) has given a table of values of $\sigma_1$ ($= \sigma_{1g}$ or $\sigma_{1u}$), $\sigma_2$ ($= \sigma_{2g}$ or $\sigma_{2u}$),
and $\gamma$ for several estimators and several elliptically contoured distributions; part of
his table is reproduced as Table 1 below. The estimators include maximum likelihood
for the multivariate t-distribution with 1 and 5 degrees of freedom [ML:T(1) and
ML:T(5)] and Huber-type estimates with $u_1(\cdot)$ and $u_2(\cdot)$ defined by

$$u_1(d) = \begin{cases} 1, & d \le r, \\ r/d, & d > r, \end{cases} \tag{5.17}$$

$$u_2(d^2) = \begin{cases} 1/\beta, & d^2 \le r^2, \\ r^2/(\beta d^2), & d^2 > r^2. \end{cases} \tag{5.18}$$

A Huber-type estimator is denoted by HUB($q$), where $q = \Pr\{\chi_p^2 > r^2\}$ and $\beta$ is
determined by $\mathcal{E}\chi_p^2u_2(\chi_p^2) = p$. The distributions are the multivariate t-distributions
with 1 and 5 degrees of freedom, the contaminated normal (CN) with $\varepsilon = .1$ and
$c = 9$, and the normal.


Table 1

p = 2

                  T(1)                 T(5)                  CN                 Normal
Estimator     σ1    σ2    γ        σ1    σ2    γ        σ1    σ2    γ        σ1    σ2    γ
ML:T(1)      1.67  3.33  1.00     1.48  0.89  1.75     1.45  0.69  1.73     1.43  0.46  2.02
ML:T(5)      2.28  5.84  0.28     1.29  0.51  1.00     1.28  0.51  1.03     1.11  0.05  1.31
HUB(.5)      1.70  3.73  0.65     1.50  1.41  0.92     1.46  1.10  0.89     1.44  0.98  1.00
HUB(.1)      2.15  5.10  0.29     1.32  0.57  0.78     1.23  0.36  0.83     1.09  0.09  1.00
S             ∞     ∞     -       3.00  2.00  0.60     2.77  1.77  0.56     1.00  0.00  1.09

p = 10

                  T(1)                 T(5)                  CN                 Normal
Estimator     σ1    σ2    γ        σ1    σ2    γ        σ1    σ2    γ        σ1    σ2    γ
ML:T(1)      1.18  2.37  1.10     1.17  0.50  1.17     1.16  0.18  1.10     1.16  0.05  1.22
ML:T(5)      1.28  2.89  0.60     1.13  0.45  1.00     1.11  0.22  1.00     1.09  0.02  1.15
HUB(.5)      1.23  2.68  1.09     1.15  0.63  1.07     1.09  0.16  0.95     1.08  0.12  1.00
HUB(.1)      1.48  3.55  0.50     1.21  0.50  0.85     1.07  0.11  0.91     1.01  0.01  1.08
S             ∞     ∞     -       3.00  2.00  0.60     2.77  1.77  0.56     1.00  0.00  1.00

The two values, $\sigma_1$ and $\sigma_2$, for the maximum likelihood estimator ML:T(1) are the
smallest for the distribution T(1) although the values for HUB(.5) are only slightly
larger. Similarly, the values for ML:T(5) are the smallest for T(5), but the values
for ML:T(1) and HUB(.1) are close. The values for HUB(.1) are smallest for CN. Of
course, $S$ is best for the normal and HUB(.1) is close. $S$ is not a valid estimator for
T(1) because the second moment of $X$ does not exist, and $S$ is not accurate for T(5)
and CN. We see that $S$ is not a very robust estimator.
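The row for $S$ in Table 1 can be recomputed from the kurtosis parameter alone, since for $S$ the coefficients are $\sigma_1 = 1 + \kappa$ and $\sigma_2 = \kappa$, as in (4.17), and $\gamma = p/\mathcal{E}R^2$. The sketch below does this for the contaminated normal and the multivariate $t$. It treats the contaminated normal as a scale mixture in which the variance multiplier is 1 with probability $1 - \varepsilon$ and $c$ with probability $\varepsilon$; that representation and the function names are assumptions of this sketch.

```python
def contaminated_normal_row(eps, c):
    """(sigma_1, sigma_2, gamma) for S under the contaminated normal."""
    e_tau = (1 - eps) + eps * c           # E R^2 / p
    e_tau2 = (1 - eps) + eps * c ** 2     # E R^4 / (p (p + 2))
    kappa = e_tau2 / e_tau ** 2 - 1
    return 1 + kappa, kappa, 1 / e_tau


def t_row(m):
    """(sigma_1, sigma_2, gamma) for S under the multivariate t, m > 4 d.f."""
    kappa = 2 / (m - 4)
    return 1 + kappa, kappa, (m - 2) / m


print(contaminated_normal_row(0.1, 9))   # about (2.78, 1.78, 0.56)
print(t_row(5))                          # (3.0, 2.0, 0.6)
```

The values for T(5) match the table exactly, and those for CN agree with the tabulated 2.77, 1.77, and 0.56 up to rounding in the last digit. None of the three quantities depends on $p$, which is why the $S$ rows for $p = 2$ and $p = 10$ coincide in these columns.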


6.  Spherical  Matrix  Distributions 

The observations $x_1, \ldots, x_N$ constitute an $N \times p$ matrix

$$X = \begin{bmatrix} x_1' \\ \vdots \\ x_N' \end{bmatrix}. \tag{6.1}$$

Consider an $N \times p$ random matrix $Y$. We define the following classes of matrices:

Left-spherical: $\quad Q_N Y \stackrel{d}{=} Y \quad \forall Q_N$, (6.2)

Right-spherical: $\quad YQ_p \stackrel{d}{=} Y \quad \forall Q_p$, (6.3)

Vector-spherical: $\quad Q_{Np}\,\mathrm{vec}\,Y \stackrel{d}{=} \mathrm{vec}\,Y \quad \forall Q_{Np}$, (6.4)

where $Q_m$ denotes an orthogonal matrix of order $m$.

If $Y$ is vector-spherical and has a density, it is also left-spherical and right-spherical
and $Y'$ is also vector-spherical because the density has the form

$$g(\mathrm{tr}\,Y'Y) = g[(\mathrm{vec}\,Y)'\,\mathrm{vec}\,Y] = g[(\mathrm{vec}\,Y')'\,\mathrm{vec}\,Y']. \tag{6.5}$$

An example is the case of all of the elements of $Y$ being independent $N(0, 1)$ variables;
in that case $g(w) = (2\pi)^{-Np/2}e^{-w/2}$.

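Define

$$X = YA' + e_N\mu', \tag{6.6}$$

where $AA' = \Lambda$ and $e_N' = (1, \ldots, 1)$. Since (6.6) is equivalent to $Y = (X -
e_N\mu')(A')^{-1}$, and $(A')^{-1}A^{-1} = \Lambda^{-1}$, $X$ has the density

$$|\Lambda|^{-N/2}\,g[\mathrm{tr}\,\Lambda^{-1}(X - e_N\mu')'(X - e_N\mu')]
= |\Lambda|^{-N/2}\,g\left[\sum_{\alpha=1}^{N}(x_\alpha - \mu)'\Lambda^{-1}(x_\alpha - \mu)\right]. \tag{6.7}$$

From (6.5) we deduce that $\mathrm{vec}\,Y$ has the representation

$$\mathrm{vec}\,Y \stackrel{d}{=} R\,\mathrm{vec}\,U, \tag{6.8}$$

where $w = R^2$ has the density

$$\frac{\pi^{Np/2}}{\Gamma(Np/2)}\,w^{\frac{Np}{2}-1}g(w), \tag{6.9}$$

$\mathrm{vec}\,U$ has the uniform distribution on $(\mathrm{vec}\,U)'\,\mathrm{vec}\,U = 1$, and $R$ and $\mathrm{vec}\,U$ are
independent. The covariance matrix of $\mathrm{vec}\,Y$ is

$$\mathcal{E}\,\mathrm{vec}\,Y\,(\mathrm{vec}\,Y)' = \frac{\mathcal{E}R^2}{Np}\,I_{Np} = \frac{\mathcal{E}R^2}{Np}\,(I_p \otimes I_N). \tag{6.10}$$

Since $\mathrm{vec}\,FGH = (H' \otimes F)\,\mathrm{vec}\,G$ for any conformable matrices $F$, $G$, and $H$, we
can write (6.6) as

$$\mathrm{vec}\,X = (A \otimes I_N)\,\mathrm{vec}\,Y + \mu \otimes e_N. \tag{6.11}$$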

Thus 


£  vecX 

(6.12) 

Cov  (vecX) 

=  (A  ®  /w)cov(vecY)(A#  0  JN) 

=  A®  Is, 

(6.13) 

£(  row  of  X) 

=  M\ 

(6.14) 

Cov(tow  of  X) 

n 

(6.15) 

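The rows of $X$ are uncorrelated (though not necessarily independent). From (6.11)
we obtain

$$\operatorname{vec} X \stackrel{d}{=} R (A \otimes I_N) \operatorname{vec} U + \mu \otimes e_N, \tag{6.16}$$

$$X \stackrel{d}{=} R U A' + e_N \mu'. \tag{6.17}$$

Since $X - e_N \mu' = (X - e_N \bar{x}') + e_N (\bar{x} - \mu)'$, we can write the density of $X$ as

$$|\Lambda|^{-N/2} g[\operatorname{tr} \Lambda^{-1} (X - e_N \bar{x}')'(X - e_N \bar{x}') + N (\bar{x} - \mu)' \Lambda^{-1} (\bar{x} - \mu)], \tag{6.18}$$

where $\bar{x} = (1/N) X' e_N$. This shows that a sufficient set of statistics for $\mu$ and $\Lambda$ is
$\bar{x}$ and $nS = (X - e_N \bar{x}')'(X - e_N \bar{x}')$, as for the normal distribution.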
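The step from (6.7) to (6.18) rests on the cross term vanishing, because the deviations from $\bar{x}$ sum to zero. A minimal numerical check of the resulting decomposition follows; the data, $\mu$, and $\Lambda$ are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, p = 6, 3
X = rng.standard_normal((N, p)) + 2.0      # arbitrary data matrix
mu = rng.standard_normal(p)                # arbitrary location vector
A = rng.standard_normal((p, p))
Lam = A @ A.T + np.eye(p)                  # a positive definite Lambda
Lam_inv = np.linalg.inv(Lam)

e_N = np.ones(N)
xbar = X.T @ e_N / N                       # xbar = (1/N) X' e_N
D_mu = X - np.outer(e_N, mu)               # X - e_N mu'
D_bar = X - np.outer(e_N, xbar)            # X - e_N xbar'

# Argument of g in (6.7), in both of its forms.
quad_trace = np.trace(Lam_inv @ D_mu.T @ D_mu)
quad_rows = sum((x - mu) @ Lam_inv @ (x - mu) for x in X)

# Argument of g in (6.18).
quad_split = (np.trace(Lam_inv @ D_bar.T @ D_bar)
              + N * (xbar - mu) @ Lam_inv @ (xbar - mu))

cross_term = D_bar.T @ e_N                 # zero vector: deviations sum to zero
```

Since the data enter the density only through `D_bar.T @ D_bar` and `xbar`, these two quantities are sufficient whatever the form of $g$.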

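The maximum likelihood estimators of $\mu$ and $\Lambda$ are

$$\hat{\mu} = \bar{x}, \tag{6.19}$$

$$\hat{\Lambda} = \frac{p}{w_g} \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})', \tag{6.20}$$

where $w_g$ maximizes $w^{Np/2} g(w)$ [Anderson, Fang, and Hsu (1986)]. Note that $\hat{\mu}$ is
the same estimator as for the normal and $\hat{\Lambda}$ is a multiple of the estimator of $\Lambda$ in the
normal case.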
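To make the role of $w_g$ in (6.20) concrete, the sketch below locates the maximizer of $w^{Np/2} g(w)$ on a grid. It is an illustration only: the grid search, the helper name `w_g`, and the second generator $g(w) \propto e^{-w^2/2}$ are assumptions introduced here, not taken from the paper. For the normal $g$ the maximizer is $w_g = Np$, so (6.20) reduces to the usual estimator with divisor $N$.

```python
import numpy as np

def w_g(log_g, N, p, grid):
    """Grid argmax of w^(Np/2) g(w), computed on the log scale."""
    objective = 0.5 * N * p * np.log(grid) + log_g(grid)
    return grid[np.argmax(objective)]

N, p = 10, 3
grid = np.linspace(0.01, 200.0, 200_000)

# Normal case: log g(w) = -w/2 up to a constant, which does not move the argmax.
w_normal = w_g(lambda w: -0.5 * w, N, p, grid)

# Hypothetical non-normal generator: g(w) proportional to exp(-w^2 / 2).
w_other = w_g(lambda w: -0.5 * w**2, N, p, grid)

rng = np.random.default_rng(4)
X = rng.standard_normal((N, p))
D = X - X.mean(axis=0)
G = D.T @ D                                # sum of (x_a - xbar)(x_a - xbar)'

Lam_hat_normal = (p / w_normal) * G        # (6.20) with the normal g
Lam_hat_other = (p / w_other) * G          # (6.20) with the hypothetical g
```

Setting the derivative of the log objective to zero gives $w_g = Np$ in the normal case and $w_g = \sqrt{Np/2}$ for the hypothetical generator, so the two estimates differ only by a scalar factor.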

Theorem. Let $f(X)$ be a vector-valued function of $X$ such that

$$f(X + e_N \nu') = f(X) \quad \forall\, \nu, \tag{6.21}$$

and

$$f(cX) = f(X) \quad \forall\, c \neq 0. \tag{6.22}$$

Then the distribution of $f(X)$, where $X$ has the arbitrary density (6.7), is the same
as the distribution of $f(X)$, where $X$ has the normal density (6.7).

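Proof. Substitution of the representation (6.16) into $f(X)$ gives

$$f(X) \stackrel{d}{=} f[R (A \otimes I_N) \operatorname{vec} U + \mu \otimes e_N] \tag{6.23}$$

$$= f[R (A \otimes I_N) \operatorname{vec} U] \tag{6.24}$$

by (6.21) and

$$f(X) \stackrel{d}{=} f[(A \otimes I_N) \operatorname{vec} U] \tag{6.25}$$

by (6.22). The right-hand side of (6.25) does not involve $R$ and hence does not depend on $g(\cdot)$.

□

Any statistic satisfying (6.21) and (6.22) has the same distribution for all $g(\cdot)$.
Hence, if its distribution is known for the normal case, the distribution is valid for all
elliptically contoured distributions.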
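The argument of the proof can be traced on a single draw. In this sketch (dimensions, seed, $A$, $\mu$, and the trial values of $R$ are all illustrative assumptions) the statistic $f$ is the sample correlation matrix, which satisfies both (6.21) and (6.22); its value computed from $X = RUA' + e_N \mu'$ coincides with its value computed from $UA'$ for every $R > 0$.

```python
import numpy as np

def corr(X):
    """Sample correlation matrix of the columns of X."""
    return np.corrcoef(X, rowvar=False)

rng = np.random.default_rng(5)
N, p = 8, 3

Z = rng.standard_normal((N, p))
U = Z / np.linalg.norm(Z)            # Frobenius norm 1, i.e. tr U'U = 1
A = rng.standard_normal((p, p))
mu = np.array([1.0, -2.0, 0.5])
e_N = np.ones(N)

reference = corr(U @ A.T)            # f[(A kron I_N) vec U], as in (6.25)

results = []
for R in (0.3, 1.0, 7.5):            # any positive radial value
    X = R * U @ A.T + np.outer(e_N, mu)   # representation (6.17)
    results.append(corr(X))
```

Because `reference` involves only $U$ and $A$, its distribution cannot depend on the law of $R$, and therefore not on $g$.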


Anderson and Fang (1990b) gave the examples of the correlation coefficients and
the multiple correlation coefficient. They also showed that when $\mu = 0$ the distribution
of Hotelling's $T^2 = N \bar{x}' S^{-1} \bar{x}$ does not depend on $g(\cdot)$. Any likelihood ratio
criterion under normality that is location invariant in the sense of (6.21) and scale
invariant in the sense of (6.22) has the same distribution for all $g(\cdot)$. The sphericity
criterion is an example.
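The invariance properties of these two statistics can be verified directly. The sketch below assumes $S$ is the sample covariance matrix with divisor $N - 1$ and takes the sphericity criterion in the form $|S| / (\operatorname{tr} S / p)^p$; both conventions, and the function names, are assumptions made for illustration.

```python
import numpy as np

def sample_cov(X):
    """S with divisor N - 1 (assumed convention)."""
    D = X - X.mean(axis=0)
    return D.T @ D / (X.shape[0] - 1)

def hotelling_T2(X):
    """T^2 = N xbar' S^{-1} xbar, for testing mu = 0."""
    xbar = X.mean(axis=0)
    return X.shape[0] * xbar @ np.linalg.solve(sample_cov(X), xbar)

def sphericity(X):
    """|S| / (tr S / p)^p, a monotone function of the normal-theory LRC."""
    S = sample_cov(X)
    p = S.shape[0]
    return np.linalg.det(S) / (np.trace(S) / p) ** p

rng = np.random.default_rng(6)
N, p = 12, 3
X = rng.standard_normal((N, p))
c = 4.2                                   # arbitrary scale factor
shift = np.outer(np.ones(N), [3.0, -1.0, 2.0])

T2_scaled = hotelling_T2(c * X)           # equals T^2(X): scale invariant
T2_shifted = hotelling_T2(X + shift)      # differs: not location invariant
sph_moved = sphericity(c * X + shift)     # equals sphericity(X)
```

$T^2$ satisfies (6.22) but not (6.21), which is why the result for it is stated only for $\mu = 0$; the sphericity criterion satisfies both conditions.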

Any function of the sufficient set of statistics that is translation invariant, that is,
that satisfies (6.21), is a function of $S$. Thus inference concerning $\Sigma$ can be based
on $S$.

Anderson, Fang, and Hsu (1986) considered the likelihood ratio criterion for testing
the null hypothesis $(\mu, \Lambda) \in \omega$ in the model $(\mu, \Lambda) \in \Omega$. Suppose $(\mu, \Lambda) \in \omega$
implies $(\mu, c\Lambda) \in \omega$ and $(\mu, \Lambda) \in \Omega$ implies $(\mu, c\Lambda) \in \Omega$ $\forall\, c > 0$. Then the likelihood
ratio criterion for arbitrary $g(\cdot)$ is the same as the LRC for normal $g(\cdot)$.

Suppose further that $\omega = \omega_m \times \omega_l$, $\Omega = \Omega_m \times \Omega_l$, $\mu_1, \mu_2 \in \omega_m$ implies $\mu_2 - \mu_1 \in \omega_m$,
$\mu_1, \mu_2 \in \Omega_m$ implies $\mu_2 - \mu_1 \in \Omega_m$, $\mu \in \omega_m$ implies $c\mu \in \omega_m$, and $\mu \in \Omega_m$ implies
$c\mu \in \Omega_m$ $\forall\, c$. Then if the distribution of the LRC does not depend on $(\mu, \Lambda)$ under
normality, it does not depend on $g(\cdot)$ or on $(\mu, \Lambda)$. Anderson, Fang, and Hsu gave
several examples including the test for lack of correlation between sets of variates.
They pointed out that the result also applies to tests of equality of several covariance
matrices.

This class of vector elliptically contoured distributions shows that the sampling
theory  for  the  normal  distribution  is  valid  for  a  much  wider  class  of  distributions. 
Several  papers  in  Fang  and  Anderson  (1990)  show  that  many  properties  of  the  normal 
can  be  extended  to  this  class.  The  disadvantage  of  these  models  is  that  except  for 
the  normal  the  observations  are  dependent,  though  uncorrelated.  The  advantage  is 
that  the  similarity  to  the  normal  is  exact  rather  than  asymptotic. 

7.  References 

Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis, John
Wiley & Sons, New York.

Anderson, T. W., and Kai-Tai Fang (1990a). On the theory of multivariate elliptically
contoured distributions and their applications, in Statistical Inference in Elliptically
Contoured and Related Distributions (Kai-Tai Fang and T. W. Anderson,
eds.), Allerton Press, Inc., New York, 1990, 1-23.

Anderson, T. W., and Kai-Tai Fang (1990b). Inference in multivariate elliptically
contoured distributions based on maximum likelihood, in Statistical Inference in
Elliptically Contoured and Related Distributions (Kai-Tai Fang and T. W. Anderson,
eds.), Allerton Press, Inc., New York, 1990, 201-216.

Anderson, T. W., Kai-Tai Fang, and Huang Hsu (1986). Maximum-likelihood estimates
and likelihood ratio criteria for multivariate elliptically contoured distributions,
The Canadian Journal of Statistics, 14, 55-59.

Bartlett, M. S. (1934). The vector representation of a sample, Proceedings of the
Cambridge Philosophical Society, 30, 327-340.

Chmielewski, M. A. (1981). Elliptically symmetric distributions: A review and
bibliography, International Statistical Review, 49, 67-74.

Fang, Kai-Tai, and T. W. Anderson, eds. (1990). Statistical Inference in Elliptically
Contoured and Related Distributions, Allerton Press, Inc., New York.

Fang, Kai-Tai, and Yao-Ting Zhang (1990). Generalized Multivariate Analysis,
Springer-Verlag, Berlin.

Hartman, P., and A. Wintner (1940). On the spherical approach to the normal
distribution law, American Journal of Mathematics, 62, 759-779.

Kelker, Douglas (1970). Distribution theory of spherical distributions and a location
scale parameter generalization, Sankhyā, Series A, 32, 419-430.

Magnus, J. R., and H. Neudecker (1979). The commutation matrix: some properties
and applications, Annals of Statistics, 7, 381-394.

Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications,
Biometrika, 57, 519-530.

Maronna, Ricardo Antonio (1976). Robust M-estimators of multivariate location and
scatter, Annals of Statistics, 4, 51-67.

Maxwell, J. C. (1860). Illustration of the dynamical theory of gases - Part I. On the
motions and collisions of perfectly elastic bodies, Taylor's Philosophical Magazine,
19, 19-32.

Muirhead, Robb J., and C. M. Waternaux (1980). Asymptotic distributions in
canonical correlation analysis and other multivariate procedures for nonnormal
populations, Biometrika, 67, 31-43.

Tyler, David E. (1982). Radial estimates and the test for sphericity, Biometrika, 69,
429-436.

Tyler, David E. (1983). Robustness and efficiency properties of scatter matrices,
Biometrika, 70, 411-420.




REPORT  DOCUMENTATION  PAGE 

1. AGENCY USE ONLY (Leave blank)

2. REPORT DATE: July 1992

3. REPORT TYPE AND DATES COVERED: Technical Report

4. TITLE AND SUBTITLE: Nonnormal Multivariate Distributions: Inference
Based on Elliptically Contoured Distributions

5. FUNDING NUMBERS: ARO DAAL03-89-K-0035

6. AUTHOR(S): T. W. Anderson

7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES):
Department of Statistics, Sequoia Hall, Stanford University,
Stanford, CA 94305-4065

8. PERFORMING ORGANIZATION REPORT NUMBER: 28

9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES):
U. S. Army Research Office, P. O. Box 12211,
Research Triangle Park, NC 27709-2211

10. SPONSORING/MONITORING AGENCY REPORT NUMBER: 28

11. SUPPLEMENTARY NOTES

The view, opinions and/or findings contained in this report are those of the
author(s) and should not be construed as an official Department of the Army
position, policy, or decision, unless so designated by other documentation.

12a. DISTRIBUTION/AVAILABILITY STATEMENT

Approved for public release; distribution unlimited.

12b. DISTRIBUTION CODE

13. ABSTRACT (Maximum 200 words)

The  class  of  elliptically  contoured  distributions,  which  includes  multivariate  t- 
distributions  and  contaminated  normal  distributions,  serves  as  a  useful  generalization 
of  the  class  of  normal  multivariate  distributions.  The  density,  marginal  and  condi¬ 
tional  densities,  and  moments  of  an  elliptically  contoured  distribution  are  related  in  a 
simple  fashion  to  those  of  a  normal  distribution.  The  asymptotic  normal  distributions 
of  the  sample  mean  and  covariance  matrix  are  developed  and  are  compared  with  the 
asymptotic  distributions  of  the  maximum  likelihood  estimators  of  the  parameters  of 
an  elliptically  contoured  distribution.  The  class  of  elliptically  contoured  distributions 
serves  as  a  model  for  evaluating  other  robust  estimators.  Many  test  procedures  for 
normal  distributions  are  easily  modified  for  the  elliptically  contoured  distributions. 

Further  generalizations  are  discussed. 

14. SUBJECT TERMS

Nonnormal multivariate distributions, elliptically
contoured distributions, multivariate t-distributions,
robust estimators.

17. SECURITY CLASSIFICATION OF REPORT: UNCLASSIFIED

18. SECURITY CLASSIFICATION OF THIS PAGE: UNCLASSIFIED

19. SECURITY CLASSIFICATION OF ABSTRACT: UNCLASSIFIED

20. LIMITATION OF ABSTRACT: UL

NSN 7540-01-280-5500

Standard Form 298 (Rev. 2-89)
Prescribed by ANSI Std. Z39-18
298-102
